Sensitivity of Predictors in Educational Data: A Bayesian Network Model

Authors

  • Nicholas Misiunas
  • Miroslava Raspopovic
  • Kavitha Chandra
  • Asil Oztekin
Abstract

This research investigates the application of Bayesian networks to predict causal relationships in a dataset that captures several demographic and academic features of a group of students from a four-year public university. The educational dataset contains both quantitative and qualitative variables, some of which exhibit strong pairwise dependence. To identify this dependence, a factorial analysis of the mixed data is conducted; it accommodates both variable types and yields a new coordinate space that captures the variance of the data in fewer dimensions. This exploratory stage enables visualization of groups of dependent variables that may be used to predict outcomes of interest, and it also provides a validation of the results of the Bayesian network (BN) structure modeling. The BN is learned using bootstrapped arc strength averaging to derive a graphical relationship between variables, with each arc assigned a persistence parameter that reflects how often it occurs in the learning process. The resulting network is characterized by two major, relatively independent structures: one formed by college academic performance metrics and the other by financial, housing, and student demographic variables. The prediction accuracy of the BN is evaluated using evidence from pre-college and ongoing college variables. The pre-college evidence is found not to be sensitive to the college degree completion outcome, with an accuracy rate of only 55%. The ongoing college evidence, however, improves the prediction accuracy by 75%.
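The idea of a persistence parameter obtained by bootstrapping can be illustrated with a minimal sketch. It is not the authors' implementation: the structure learner here is a simple stand-in that keeps an undirected pair whenever its empirical mutual information exceeds a threshold, whereas the paper learns directed arcs of a BN. The data are synthetic, and every name (`mutual_information`, `learn_edges`, `bootstrap_edge_strength`, the threshold of 0.05, the three binary columns) is a hypothetical choice made for illustration.

```python
import math
import random
from collections import Counter
from itertools import combinations


def mutual_information(rows, i, j):
    """Empirical mutual information (in nats) between discrete columns i and j."""
    n = len(rows)
    joint = Counter((r[i], r[j]) for r in rows)
    left = Counter(r[i] for r in rows)
    right = Counter(r[j] for r in rows)
    return sum(
        (c / n) * math.log(c * n / (left[a] * right[b]))
        for (a, b), c in joint.items()
    )


def learn_edges(rows, n_vars, threshold):
    """Stand-in structure learner (assumption): keep pairs with MI above threshold."""
    return {
        (i, j)
        for i, j in combinations(range(n_vars), 2)
        if mutual_information(rows, i, j) > threshold
    }


def bootstrap_edge_strength(rows, n_vars, n_boot=200, threshold=0.05, seed=0):
    """Fraction of bootstrap resamples in which each pair is selected as an edge."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_boot):
        # Resample the rows with replacement, then relearn the structure.
        sample = [rng.choice(rows) for _ in rows]
        counts.update(learn_edges(sample, n_vars, threshold))
    return {
        pair: counts[pair] / n_boot
        for pair in combinations(range(n_vars), 2)
    }


# Synthetic data (assumption): column 1 copies column 0 about 90% of the
# time, and column 2 is independent of both.
data_rng = random.Random(42)
rows = []
for _ in range(300):
    a = data_rng.randint(0, 1)
    b = a if data_rng.random() < 0.9 else 1 - a
    c = data_rng.randint(0, 1)
    rows.append((a, b, c))

strength = bootstrap_edge_strength(rows, n_vars=3)
for pair, value in sorted(strength.items()):
    print(pair, round(value, 2))
```

In this sketch the dependent pair is selected in nearly every resample while pairs involving the independent column are rarely selected, which is the sense in which a persistence value separates stable relationships from chance ones.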

Similar resources

Provide a Predictive Model to Identify People with Diabetes Using the Decision Tree

Background: Today, in most hospitals in Iran, there is an extensive database of patient characteristics that includes a large amount of information related to medical, family and medical records. Finding a knowledge model of this information can help to predict the performance of the medical system and improve educational processes. Methods: Data mining techniques are analytical tools that are...

Risk Analysis of Operating Room Using the Fuzzy Bayesian Network Model

To enhance patient safety, we need effective methods for risk management. This work aims to propose an integrated approach to risk management for a hospital system. To improve patient safety, we should develop flexible methods in which different aspects of risk and types of information are taken into consideration. This paper proposes a fuzzy Bayesian network to model and analyze risk in the op...

A Bayesian Approach to Estimate Parameters of a Random Coefficient Transition Binary Logistic Model with Non-monotone Missing Pattern and some Sensitivity Analyses

A transition binary logistic model with random coefficients is proposed to model the unemployment status of household members in the two seasons of spring and summer. The data come from the labor force survey performed by the Statistical Center of Iran in 2006. This model is introduced to take into account two kinds of correlation in the data, one due to the longitudinal nature o...

Assessment of Artificial Neural Network Models and Maximum Entropy in Zoning of Gully Erosion Sensitivity of Golestan Dam Basin

Zoning of gully erosion susceptibility and determining the factors controlling gully erosion is very important and vital. The aim of this study was to investigate the spatial distribution of gully erosion using two models of ANN and MaxEnt and to determine the factors affecting this type of erosion in Golestan Dam basin. Therefore, 14 factors in the form of three divisions, including topographi...

Determining Risk Factors and Presenting a Prognostic Model of Pulmonary Embolism in Hospitalized Patients Using Bayesian Networks

Background and Objectives: Pulmonary embolism is a potentially fatal and prevalent event that has led to a gradual increase in the number of hospitalizations in recent years. For this reason, it is one of the most challenging diseases for physicians. The main purpose of this paper was to report a research project to compare different data mining algorithms to select the most accurate model for ...

A Model for Tax Evasion Forecasting based on ID3 Algorithm and Bayesian Network

Nowadays, knowledge is a valuable and strategic source as well as an asset for evaluation and forecasting. Presenting these strategies in discovering corporate tax evasion has become an important topic today and various solutions have been proposed. In the past, various approaches to identify tax evasion and the like have been presented, but these methods have not been very accurate and the ove...

Publication date: 2015